TIPS: A Translingual Information Processing System

Authors

  • Yaser Al-Onaizan
  • Radu Florian
  • Martin Franz
  • Hany Hassan
  • Young-Suk Lee
  • J. Scott McCarley
  • Kishore Papineni
  • Salim Roukos
  • Jeffrey S. Sorensen
  • Christoph Tillmann
  • Todd Ward
  • Fei Xia
Abstract

Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g., the proportion of English web users, which has been decreasing as a fraction of the growing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search multilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on-demand translation of Arabic web pages into English. The cross-lingual search engine supports a fast search capability (sub-second response for typical queries) and achieves state-of-the-art performance in the high-precision region of the result list. The on-demand statistical machine translation (SMT) uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. The on-demand SMT uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.

Similar resources

Translingual Information Management by Natural Language Processing

Preface: Translingual information management that can choose appropriate information and obtain useful knowledge from the flood of global information is being increasingly demanded. Among the related technologies, natural language processing is one of the most promising for meeting information needs because natural language processing can deal with documents that have essential roles in informatio...

Towards Translingual Information Access Using Portable Information Extraction

We report on a small study undertaken to demonstrate the feasibility of combining portable information extraction with MT in order to support translingual information access. After describing the proposed system's usage scenario and system design, we describe our investigation of transferring information extraction techniques developed for English to Korean. We conclude with a brief discussion ...

Site Method MIR TIR TIR / MIRCMU

We present an attempt at a coherent vision of an end-to-end translingual information retrieval system. We begin by presenting a sample of the broad range of possibilities, and the results of some initial work comparing the different approaches. We then present an overall workstation architecture, followed by two possible approaches to the actual translingual IR stage presented in detail. Ranking...

Translingual Information Retrieval: Learning from Bilingual Corpora

Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces new TLIR methods and reports on comparative TLIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query translation and statistical-IR appr...

Translingual Information Retrieval: Learning from Bilingual Corpora (AI Journal Special Issue: Best of IJCAI-97)

Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces new TLIR methods and reports on comparative TLIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query translation and statistical-IR appr...

Publication date: 2003